Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing boundaries #5

Merged
merged 10 commits into from Feb 22, 2017
Merged

Conversation

MartijnBraam
Copy link

This fixes the regex that checks for boundaries (The end of line detection was wrong)
It also makes the regex strip the quotes if they exist and made the regex more strict (and no longer contains unbound searches)

This pullrequest also contains the chaines for #4

Copy link
Owner

@djmattyg007 djmattyg007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for putting together PHPUnit test cases. I'm incredibly thankful for your contribution. There's a few things that need updating, but they're mostly small things. All of the logic changes look fine to me.

@@ -412,16 +412,14 @@ public function getBody($returnType = self::PLAINTEXT)
}

// there could be more than one boundary
preg_match_all('!boundary=(.*?)[;$]!mi', $this->emailRawContent, $matches);
preg_match_all('/boundary=(?:|")([a-zA-Z0-9\(\)_\/+-]+)(?:|")(?:$|;)/mi', $this->emailRawContent, $matches);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please outline exactly what was wrong with the previous regex? This definitely looks more comprehensive than the previous one and more correct, but I'd appreciate it if you could outline some cases where the old regex didn't work but the new one does.

Also, can you please update the comment on line 414 to include the detail in the comment that was on line 419 but has been removed by your changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [;$] part didn't seem to work on any of the test cases (at least in my php7 installation)

@@ -0,0 +1,104 @@
<?php
declare(strict_types = 1);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As much as I love PHP 7 and its strict typing, technically this library still supports PHP5. I'm fine with dropping support for anything prior to PHP 5.6, but ideally I still need to be able to run the tests on PHP 5.6 for the foreseeable future. Could you therefore please remove this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yes, I copied that straight from the phpunit docs without thinking.

class PlancakeEmailParserTest extends TestCase {

public function testSubject() {
foreach (glob(__DIR__ . '/emails/*.txt') as $testFile) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brace placement and indentation needs to follow the existing conventions of this codebase (which generally follow PSR-2).

foreach (glob(__DIR__ . '/emails/*.txt') as $testFile) {

$answerFile = str_replace('.txt', '.yml', $testFile);
$answers = \Symfony\Component\Yaml\Yaml::parse(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a use statement for Symfony\Component\Yaml\Yaml at the top of the file.

@@ -0,0 +1,2 @@
<?php
require_once(__DIR__ . '/../vendor/autoload.php');
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this style of writing paths:

require_once(dirname(__DIR__) . '/vendor/autoload.php');

All Polish diacritics included ;-) tag also included for additional tests.
It should look like in the picture attached.
Best regards
Pawel
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please ensure that all of these files have a trailing newline.
See here for context: https://unix.stackexchange.com/questions/18743/whats-the-point-in-adding-a-new-line-to-the-end-of-a-file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't give them newlines since that line is included in the multiline field, making the test fail.

Fixed it with another syntax in yaml.

@djmattyg007
Copy link
Owner

Also, could you please close the other PR in favour of this one.

@MartijnBraam
Copy link
Author

I changed all the points in your review.

@djmattyg007 djmattyg007 merged commit 2c79883 into djmattyg007:master Feb 22, 2017
@djmattyg007
Copy link
Owner

I've merged all of your changes. Thanks for your contribution! I'll be issuing a new release within the next 24 hours with your improvements.

@djmattyg007
Copy link
Owner

I've just released v4.0.0 https://github.com/djmattyg007/official-library-php-email-parser/releases/tag/4.0.0

Two of the subject-parsing test cases were failing on my machine, as I have the IMAP PHP extension installed and enabled. It turned out that not using the IMAP extension produced incorrect results for any encoded non-ASCII text. As a result, I've made the decision to require the IMAP extension to ensure correct, consistent results across all environments. I apologise if this requires you to make changes to your runtime environments, but I figured it was best not to persist with a solution that could produce two different results in two different environments due to what amounts to a misconfiguration.

@MartijnBraam
Copy link
Author

I chosen this class because I couldn't install the imap extension. Wouldn't it be better to just fix the encoding issue?

@djmattyg007
Copy link
Owner

The issue isn't with the input (which is perfecrly valid). The iconv_mime_decode function quite simply produces incorrect results for text with various diacritics, and my preference is that this library cannot produce incorrect results under any circumstances.

I've opened an issue at the Symfony Polyfill repository to see if anyone is aware of a proper polyfill: symfony/polyfill#83

A possible interim solution may be to allow forcing the undesirable behaviour by making the user set a variable on the object. What sort of override would suit you best?

@djmattyg007
Copy link
Owner

It looks like an override is unnecessary. The mb_decode_mimeheader() function works perfectly as a replacement, so I'll be pushing a new version shortly that uses that instead.

@djmattyg007
Copy link
Owner

I've just pushed a new version that removes the dependency on the IMAP PHP extension https://github.com/djmattyg007/official-library-php-email-parser/releases/tag/5.0.0

Given you started off by using PHP7 syntax in the PHPUnit test, I hope the fact that I've dropped support for PHP <= 5.5 won't be a problem.

I've also set up CI on Travis: https://travis-ci.org/djmattyg007/official-library-php-email-parser

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants